NVIDIA Unveils Nemotron ASR for Low-Latency Applications
Explore NVIDIA's new Nemotron Speech ASR model designed for voice agents and live captioning with low-latency performance.
A hands-on guide to building a compact pipeline with SpeechBrain that generates speech, adds noise, enhances audio with MetricGAN+, and measures ASR word error rates before and after denoising.
Qwen3-ASR Flash from Alibaba is a single-model ASR that auto-detects and transcribes 11 languages, supports context injection for domain terms, and keeps WER below 8% in noisy or musical audio.
AI2 released OLMoASR, an open ASR suite that includes models, training data identifiers, filtering recipes, and benchmarks, and competes closely with OpenAI Whisper across multiple tasks.
A concise guide to the 20 best voice AI blogs and news sites for 2025, covering research, product launches, ethics, and market trends to help developers and leaders stay informed.
OpenAI released GPT-Realtime and Realtime API with unified audio processing, SIP phone support, and MCP server integration, improving performance and enterprise deployment options while key speech AI challenges remain.
Amazon researchers created an AI architecture that cuts inference time by 30% by activating only task-relevant neurons, inspired by the brain's efficient processing.
NVIDIA's Canary-Qwen-2.5B model sets a new benchmark in speech recognition with a record-low Word Error Rate and fast processing speed. This open-source, commercially licensed hybrid ASR-LLM model enables advanced audio transcription and language understanding.
Mistral AI launches Voxtral, cutting-edge open-weight speech recognition models that integrate transcription and language understanding with support for long audio contexts and multiple languages.
Chinese researchers release LLaMA-Omni2, a modular speech language model that enables real-time spoken dialogue with minimal latency and strong performance using compact training data.